AITopics | Ingolstadt

Ingolstadt

c82836ed448c41094025b4a872c5341e-Supplemental.pdf

Neural Information Processing Systems | Feb-11-2026, 03:23:29 GMT

Recently there has been significant theoretical progress on understanding the convergence and generalization of gradient-based methods on nonconvex losses with overparameterized models. Nevertheless, many aspects of optimization and generalization, and in particular the critical role of small random initialization, are not fully understood.

artificial intelligence, machine learning, xxt ututt, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > Maryland > Baltimore (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

c82836ed448c41094025b4a872c5341e-Paper.pdf

Neural Information Processing Systems | Feb-11-2026, 03:23:25 GMT

artificial intelligence, initialization, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > New York > New York County > New York City (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > Maryland > Baltimore (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.52)

96ca792fddef7c1e3366c405022463cb-Paper-Conference.pdf

Neural Information Processing Systems | Feb-10-2026, 22:06:19 GMT

evaluation, mdp, point mdp, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Asia > Middle East > Jordan (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Genre:

Research Report (0.68)
Instructional Material (0.46)

Industry:

Health & Medicine (0.68)
Transportation > Infrastructure & Services (0.50)
Transportation > Ground > Road (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

16bda725ae44af3bb9316f416bd13b1b-Supplemental.pdf

Neural Information Processing Systems | Feb-7-2026, 15:35:45 GMT

algorithm, convergence rate, inequality, (16 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report > New Finding (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

16bda725ae44af3bb9316f416bd13b1b-Paper.pdf

Neural Information Processing Systems | Feb-7-2026, 15:35:41 GMT

However, since their proof relies on the existence of a convergent subsequence, it does not reveal any rate for global convergence.

algorithm, artificial intelligence, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

05b69cc4c8ff6e24c5de1ecd27223d37-Paper-Conference.pdf

Neural Information Processing Systems | Feb-7-2026, 10:15:40 GMT

activation function, approximation, neural network, (14 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)
North America > United States > District of Columbia > Washington (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Industry: Energy (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Optimization, Generalization and Differential Privacy Bounds for Gradient Descent on Kolmogorov-Arnold Networks

Wang, Puyu, Zhou, Junyu, Liznerski, Philipp, Kloft, Marius

arXiv.org Machine Learning | Feb-5-2026

Kolmogorov-Arnold Networks (KANs) have recently emerged as a structured alternative to standard MLPs, yet a principled theory for their training dynamics, generalization, and privacy properties remains limited. In this paper, we analyze gradient descent (GD) for training two-layer KANs and derive general bounds that characterize their training dynamics, generalization, and utility under differential privacy (DP). As a concrete instantiation, we specialize our analysis to logistic loss under an NTK-separable assumption, where we show that polylogarithmic network width suffices for GD to achieve an optimization rate of order $1/T$ and a generalization rate of order $1/n$, with $T$ denoting the number of GD iterations and $n$ the sample size. In the private setting, we characterize the noise required for $(ε,δ)$-DP and obtain a utility bound of order $\sqrt{d}/(nε)$ (with $d$ the input dimension), matching the classical lower bound for general convex Lipschitz problems. Our results imply that polylogarithmic width is not only sufficient but also necessary under differential privacy, revealing a qualitative gap between non-private (sufficiency only) and private (necessity also emerges) training regimes. Experiments further illustrate how these theoretical insights can guide practical choices, including network width selection and early stopping.

artificial intelligence, machine learning, theorem 4, (18 more...)

arXiv.org Machine Learning

2601.22409

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Ingolstadt (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

On The Role of Pretrained Language Models in General-Purpose Text Embeddings: A Survey

Zhang, Meishan, Zhang, Xin, Zhao, Xinping, Huang, Shouzheng, Hu, Baotian, Zhang, Min

arXiv.org Artificial Intelligence | Nov-27-2025

Text embeddings have attracted growing interest due to their effectiveness across a wide range of natural language processing (NLP) tasks, including retrieval, classification, clustering, bitext mining, and summarization. With the emergence of pretrained language models (PLMs), general-purpose text embeddings (GPTE) have gained significant traction for their ability to produce rich, transferable representations. The general architecture of GPTE typically leverages PLMs to derive dense text representations, which are then optimized through contrastive learning on large-scale pairwise datasets. In this survey, we provide a comprehensive overview of GPTE in the era of PLMs, focusing on the roles PLMs play in driving its development. We first examine the fundamental architecture and describe the basic roles of PLMs in GPTE, i.e., embedding extraction, expressivity enhancement, training strategies, learning objectives, and data construction. We then describe advanced roles enabled by PLMs, including multilingual support, multimodal integration, code understanding, and scenario-specific adaptation. Finally, we highlight potential future research directions that move beyond traditional improvement goals, including ranking integration, safety considerations, bias mitigation, structural information incorporation, and the cognitive extension of embeddings. This survey aims to serve as a valuable reference for both newcomers and established researchers seeking to understand the current state and future potential of GPTE.

information retrieval, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2507.20783

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Guangdong Province > Shenzhen (0.05)
Asia > China > Heilongjiang Province > Harbin (0.04)
(17 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.46)
Research Report > Promising Solution (0.45)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

SAGE: An Agentic Explainer Framework for Interpreting SAE Features in Language Models

Han, Jiaojiao, Xu, Wujiang, Jin, Mingyu, Du, Mengnan

arXiv.org Artificial Intelligence | Nov-27-2025

Large language models (LLMs) have achieved remarkable progress, yet their internal mechanisms remain largely opaque, posing a significant challenge to their safe and reliable deployment. Sparse autoencoders (SAEs) have emerged as a promising tool for decomposing LLM representations into more interpretable features, but explaining the features captured by SAEs remains a challenging task. In this work, we propose SAGE (SAE AGentic Explainer), an agent-based framework that recasts feature interpretation from a passive, single-pass generation task into an active, explanation-driven process. SAGE implements a rigorous methodology by systematically formulating multiple explanations for each feature, designing targeted experiments to test them, and iteratively refining explanations based on empirical activation feedback. Experiments on features from SAEs of diverse language models demonstrate that SAGE produces explanations with significantly higher generative and predictive accuracy compared to state-of-the-art baselines.

explanation, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2511.2082

Country:

Europe > Austria > Vienna (0.14)
Asia > China (0.05)
North America > United States > New Jersey (0.04)
(6 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

TrueCity: Real and Simulated Urban Data for Cross-Domain 3D Scene Understanding

Nguyen, Duc, Lai, Yan-Ling, Zhang, Qilin, Gyawali, Prabin, Schwab, Benedikt, Wysocki, Olaf, Kolbe, Thomas H.

arXiv.org Artificial Intelligence | Nov-11-2025

3D semantic scene understanding remains a long-standing challenge in the 3D computer vision community. One of the key issues pertains to limited real-world annotated data to facilitate generalizable models. The common practice to tackle this issue is to simulate new data. Although synthetic datasets offer scalability and perfect labels, their designer-crafted scenes fail to capture real-world complexity and sensor noise, resulting in a synthetic-to-real domain gap. Moreover, no benchmark provides synchronized real and simulated point clouds for segmentation-oriented domain shift analysis. We introduce TrueCity, the first urban semantic segmentation benchmark with cm-accurate annotated real-world point clouds, semantic 3D city models, and annotated simulated point clouds representing the same city. TrueCity proposes segmentation classes aligned with international 3D city modeling standards, enabling consistent evaluation of synthetic-to-real gap. Our extensive experiments on common baselines quantify domain shift and highlight strategies for exploiting synthetic data to enhance real-world 3D scene understanding. We are convinced that the TrueCity dataset will foster further development of sim-to-real gap quantification and enable generalizable data-driven models. The data, code, and 3D models are available online: https://tum-gis.github.io/TrueCity/

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2511.07007

Country: